PTMs and half-lives
The half-life of a protein is the time it takes for its concentration to decrease by half. Protein half-lives can be used as estimates of the residence time of proteins in the cell.
Proteins that reside longer in the cell may be more susceptible to oxidative damage.
It is assumed that each protein has only one modification. Proteins with no modifications are not identified.
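As a minimal sketch of the half-life definition above, the fraction of a protein pool that remains after a given time can be computed from its half-life. This assumes first-order (exponential) decay, which the text does not state explicitly, and both function names are hypothetical.

```r
# Fraction of the initial protein pool remaining after `t_hours`,
# assuming first-order decay with half-life `half_life_hours` (an assumption).
fraction_remaining <- function(t_hours, half_life_hours) {
  0.5^(t_hours / half_life_hours)
}

# Convert a first-order degradation rate constant (per hour) into a half-life.
half_life_from_rate <- function(k_per_hour) {
  log(2) / k_per_hour
}

fraction_remaining(48, 24)        # two half-lives have elapsed
half_life_from_rate(log(2) / 24)  # rate constant chosen to give a 24-hour half-life
```

Under this assumption, a protein with a 24-hour half-life is reduced to a quarter of its starting concentration after 48 hours.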
Half-lives of short-lived proteins: Proteome-wide mapping of short-lived proteins in human cells (ScienceDirect)
Half-lives of long-lived proteins: Systematic analysis of protein turnover in primary cells (Nature Communications)
Half-lives of mouse proteins: An atlas of protein turnover rates in mouse tissues (Nature Communications)
- CC_Results: Results for Cricoid Cartilage (Cartilage in manuscript)
Proteins with a short half-life
Proteins can have varying half-lives
What do the half-lives depend on?
How are they measured?
Below is a comparison of the distribution of half-lives reported in the literature with the distribution of the subset of those half-lives that belongs to proteins found in the dataset.
Though the mean half-life of these proteins is higher than the mean half-life of the whole dataset, none of the outlier proteins can be classified as long-lived proteins. Proteins can be classified as long-lived when their mean half-life exceeds 48 hours (ref), though this is an arbitrary definition.
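The 48-hour cutoff described above can be applied as a simple threshold. The sketch below uses a toy table with placeholder protein labels and half-lives. The column names `LeadProt` and `mean_hl_hours` follow those used elsewhere in this notebook, while `hl_class` and `long_lived_cutoff` are hypothetical.

```r
# Toy half-lives in hours; the protein labels and values are placeholders.
half_lives <- data.frame(
  LeadProt = c("prot_a", "prot_b", "prot_c", "prot_d"),
  mean_hl_hours = c(3.5, 22.0, 48.0, 130.0)
)

long_lived_cutoff <- 48  # hours; the arbitrary threshold mentioned in the text

# A mean half-life strictly above the cutoff counts as long-lived.
half_lives$hl_class <- ifelse(
  half_lives$mean_hl_hours > long_lived_cutoff, "long-lived", "not long-lived"
)

# The same kind of filter selects a half-life window, such as 20-25 hours.
in_peak <- half_lives$mean_hl_hours >= 20 & half_lives$mean_hl_hours <= 25
half_lives[in_peak, ]
```

Because the threshold is arbitrary, keeping it in a single named variable makes it easy to check how sensitive the classification is to the chosen value.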
Proteins enriched in the peak, with half-lives of 20-25 hours:
LeadProt
1 Q15788
2 Q8WUH6
3 P38606
4 O95619
5 Q9C0D3
6 P63208
7 Q15643
8 P31629
9 Q9BV73
10 O94986
11 Q92769
12 Q15056
13 Q96MF7
14 P36954
15 Q9UQ35
16 Q13049
17 Q86U90
18 Q96G01
19 Q9H410
20 Q0VGL1
21 Q13615
22 Q96JN8
23 P49790
24 O60216
25 Q15036
26 Q9UMZ2
27 P29374
28 O95400
29 Q9P2M7
30 Q9ULV3
31 Q16206
32 Q9H3H1
33 Q8WXI9
34 O00560
35 Q9BZF9
36 Q9HCK1
37 O60315
38 Q8N1G0
39 Q96F63
40 P42771
41 Q9P2D1
42 Q7Z7K0
43 Q66GS9
44 P16220
45 Q96FZ2
46 P31321
47 O60303
48 Q9P2P6
49 Q6P0Q8
50 Q86YC2
51 Q9HBE1
52 Q9BSM1
53 O15212
54 O60671
55 Q02833
56 P28749
57 P62487
58 Q2M3G4
59 Q9H0K1
60 Q96BN2
61 P51965
62 P00374
63 O15350
64 O00459
65 P28289
66 Q9HBM1
67 O15164
68 O43303
69 Q14159
70 Q9H0A8
71 Q69YQ0
72 Q96JK2
73 O00165
74 Q9H0F6
75 Q9UQR1
76 Q9ULW3
77 Q4LE39
78 Q9HCU9
79 P51946
80 Q8IVH2
81 Q8N5I9
82 Q9H0Z9
83 O75528
84 Q16594
85 Q00537
86 Q9Y592
87 Q9H9F9
88 Q9BTT4
89 Q08999
90 Q9UPW6
91 Q14186
92 Q9UI30
93 Q8N8D1
94 P10071
95 Q8NHQ8
96 P42568
97 Q96IK1
98 Q9Y6R9
99 Q86X02
100 Q9P0K8
101 Q8N5Y2
102 Q9NS91
103 Q9BQ65
104 Q8IYN0
105 Q8TC92
106 Q96H20
107 P55789
108 Q8ND83
109 Q9C037
110 P36543
111 Q6PIY7
112 Q7L273
113 O00463
114 Q96JM7
115 Q9BQ15
116 Q6PII3
117 Q9NYR9
118 Q13487
119 O43167
120 O43739
121 Q9NPJ6
122 Q9H999
123 Q8N300
124 O96006
125 Q9BRR0
126 O15131
127 Q16342
128 Q96Q83
The proteins in this peak are transcription factors, according to a DAVID annotation (identifier type UNIPROT_ACCESSION).
PTMs
Using genes from GenAge is legitimate, so this approach can be continued.
PTMs of interest:
PTMs that control autophagy
phosphorylation
ubiquitination
acetylation
oxPTMs
- you have a list of these
Methylation, e.g. of histones
Phosphorylation
- Only the modification [21]Phospho is present here.
The dataset is split into one group of phosphorylated proteins and another group containing all remaining proteins.
What are the phosphorylated proteins in that peak?
The list of genes associated with ageing was obtained from GenAge.
Distribution of phosphorylated ageing proteins vs phosphorylated non-ageing proteins vs non-phosphorylated proteins vs non-phosphorylated ageing proteins.
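The four groups compared above can be derived from two logical flags: whether a row carries the Phospho modification (Unimod ID 21) and whether its gene is in the GenAge list. The sketch below uses a toy table. The `gene` column, the gene symbols, and `genage_genes` are hypothetical stand-ins, not the real data.

```r
# Toy PTM table; rows and gene symbols are placeholders.
ptms <- data.frame(
  gene = c("GENE1", "GENE2", "GENE3", "GENE4"),
  unimod_id = c(21, 21, 35, 1),
  mean_hl_hours = c(12, 30, 45, 80)
)
genage_genes <- c("GENE1", "GENE3")  # hypothetical stand-in for the GenAge list

is_phospho <- ptms$unimod_id == 21         # Unimod ID 21 = Phospho
is_ageing  <- ptms$gene %in% genage_genes  # gene is ageing-associated

# Combine the two flags into a single four-level group label.
ptms$group <- paste(
  ifelse(is_phospho, "phosphorylated", "non-phosphorylated"),
  ifelse(is_ageing, "ageing", "non-ageing")
)
table(ptms$group)
```

A single `group` column of this kind can then be mapped to the fill colour of a density plot, so that all four distributions are drawn from one data frame.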
Methylation
- Filtered by the [34]Methyl modification
Acetylation
- Filtered by the [1]Acetyl modification
Oxidation
- Only modification: [35]Oxidation
oxPTMs
All PTMs related to oxidative damage in general, not only [35]Oxidation.
As can be seen in the above graphs, the presence of the small peak, which represents proteins with very short half-lives, varies depending on the type of modification. It is more prominent in PTMs that are related to ageing.
Line graph
Hypothesis: proteins with higher mean half-lives remain in the cell for longer, therefore they are more susceptible to oxidative damage and will accumulate more oxPTMs.
Approach:
Count the total number of oxPTMs on each protein (using normalised counts)
Plot the mean half-life of the proteins vs their total oxPTMs count
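The two steps of this approach can be sketched as follows. Base R is used here only to keep the example free of package dependencies. The toy rows are placeholders, `total_ox_count` is a hypothetical column name, and the sketch assumes that each protein has a single mean half-life.

```r
# Toy input: one row per oxPTM observation on a protein; values are placeholders.
ox_ptms <- data.frame(
  LeadProt = c("prot_a", "prot_a", "prot_b", "prot_c"),
  norm_counts = c(0.2, 0.3, 1.5, 0.1),
  mean_hl_hours = c(10, 10, 40, 90)
)

# Step 1: total normalised oxPTM count per protein.
ox_totals <- aggregate(
  norm_counts ~ LeadProt + mean_hl_hours,
  data = ox_ptms,
  FUN = sum
)
names(ox_totals)[names(ox_totals) == "norm_counts"] <- "total_ox_count"

# Step 2: mean half-life against the total oxPTM count.
plot(
  ox_totals$mean_hl_hours, ox_totals$total_ox_count,
  xlab = "Mean half-life (hours)",
  ylab = "Total normalised oxPTM count"
)
```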
Below is a plot showing how the number of oxPTMs changes with the mean half-life of different proteins. Proteins identified as having a very short half-life are shown in blue. The blue and red points follow a similar pattern: a few proteins have a very high number of modifications, while most show very low counts. Most proteins appear to remain largely unmodified irrespective of their half-life.
Below we will look at the sum of total counts for each modification type
The scatter plots are more interesting if you also include IUPred scores or localisation etc.
phosphorylation
Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
dplyr 1.1.0.
ℹ Please use `reframe()` instead.
ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()`
always returns an ungrouped data frame and adjust accordingly.
`summarise()` has grouped output by 'LeadProt'. You can override using the
`.groups` argument.
Warning in mean.default(pho_age_prot$sum_mod_count): argument is not numeric or
logical: returning NA
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_hline()`).
oxidation
acetyl
oxPTMs
Bar chart
Hypothesis: The higher the half-life, the greater the number of PTMs.
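One way to examine this hypothesis is to bin proteins by half-life and compare the mean PTM count per bin. The sketch below uses `cut()` with 50-hour bins, matching the bin edges used in the bar-chart code later in this notebook. The toy values are placeholders. Unlike a `case_when()` with a catch-all branch, `cut()` leaves rows with a missing half-life as `NA` instead of assigning them to the last bin.

```r
# Toy data: mean half-life and normalised PTM count per row (placeholders).
ptms <- data.frame(
  mean_hl_hours = c(5, 50, 75, 260, NA),
  norm_counts   = c(1, 3, 4, 2, 10)
)

# Right-closed 50-hour bins: a value of exactly 50 falls in "0-50".
ptms$hl_group <- cut(
  ptms$mean_hl_hours,
  breaks = c(-Inf, 50, 100, 150, 200, 250, Inf),
  labels = c("0-50", "50-100", "100-150", "150-200", "200-250", "250+"),
  right = TRUE
)

# Mean normalised count per bin; rows with an NA bin are left out.
mean_per_bin <- tapply(ptms$norm_counts, ptms$hl_group, mean)

# Plot only the bins that contain data.
barplot(
  mean_per_bin[!is.na(mean_per_bin)],
  xlab = "Half-lives (hours)",
  ylab = "Mean normalised PTM count"
)
```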
Proteins with a long half-life
Long-lived proteins can be used as estimators of chronological age. Long-lived proteins can be defined in different ways, for example based on the half-life of the protein when compared to the average half-life of proteins in the organism. In this case, long-lived proteins were obtained from the following study: paper. Proteins were classified as long-lived based on their degree of degradation during the experiment and therefore it was possible to discover new long-lived proteins (no a priori assumptions were made).
The study identified a list of long-lived proteins in rats, so the human orthologs of these proteins were found.
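The ortholog step can be sketched as a join between the rat protein list and a rat-to-human mapping table. All identifiers below are placeholders, and `ortholog_map` is a hypothetical table. The source does not say which ortholog resource was used.

```r
# Placeholder rat long-lived proteins and a hypothetical rat-to-human ortholog map.
rat_long_lived <- data.frame(rat_id = c("rat_1", "rat_2", "rat_3"))
ortholog_map <- data.frame(
  rat_id   = c("rat_1", "rat_2", "rat_2"),
  human_id = c("human_1", "human_2a", "human_2b")
)

# Inner join: one row per rat-human ortholog pair.
# A rat protein with several orthologs yields several rows.
human_long_lived <- merge(rat_long_lived, ortholog_map, by = "rat_id")

# Rat proteins without any human ortholog in the map.
unmapped <- setdiff(rat_long_lived$rat_id, ortholog_map$rat_id)
```

Tracking `unmapped` is worthwhile, because proteins lost at this step silently shrink the long-lived set used in the comparisons that follow.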
PTMs
Phosphorylation
Methylation
Warning: Removed 494739 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 44297 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 29244 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 3624 rows containing non-finite outside the scale range
(`stat_density()`).
Acetylation
Warning: Removed 525274 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 16410 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 31892 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 976 rows containing non-finite outside the scale range
(`stat_density()`).
Oxidation
Warning: Removed 496997 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 42625 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 29830 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 3038 rows containing non-finite outside the scale range
(`stat_density()`).
oxPTMs
Warning: Removed 442146 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 94211 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 26565 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 6303 rows containing non-finite outside the scale range
(`stat_density()`).
B cells
cytoplasmic projects:
old B cells: PXD006570
young B cells: PXD006572
nuclear proteins:
old B cells: PXD006571
young B cells: PXD006576
Distribution of the half-lives
Take the cytoplasmic proteins and plot 2 density plots. Compare the distributions of the half-lives between the
PTMs
Phosphorylation
For cytoplasmic proteins what are the differences between old and young proteins?
library(dplyr)
library(ggplot2)

# Split each dataset into phosphorylated rows (Unimod ID 21) and all other modifications.
human_ptms_cyto_old_hl_pho <- human_ptms_cyto_old_hl %>% filter(unimod_id == 21)
human_ptms_cyto_old_hl_no_pho <- human_ptms_cyto_old_hl %>% filter(unimod_id != 21)
human_ptms_cyto_young_hl_no_pho <- human_ptms_cyto_young_hl %>% filter(unimod_id != 21)
human_ptms_cyto_young_hl_pho <- human_ptms_cyto_young_hl %>% filter(unimod_id == 21)
human_ptms_nuc_old_hl_pho <- human_ptms_nuc_old_hl %>% filter(unimod_id == 21)
human_ptms_nuc_old_hl_no_pho <- human_ptms_nuc_old_hl %>% filter(unimod_id != 21)
human_ptms_nuc_young_hl_pho <- human_ptms_nuc_young_hl %>% filter(unimod_id == 21)
human_ptms_nuc_young_hl_no_pho <- human_ptms_nuc_young_hl %>% filter(unimod_id != 21)

# Density of mean half-lives for the four cytoplasmic groups, weighted by normalised counts.
# Half-lives outside 0-100 hours and missing values are dropped, hence the warnings below.
ggplot() +
  geom_density(data = human_ptms_cyto_old_hl_pho, aes(x = mean_hl_hours, weight = norm_counts, fill = 'human_ptms_cyto_old_hl_pho'), alpha = 0.7, bw = 2) +
  geom_density(data = human_ptms_cyto_old_hl_no_pho, aes(x = mean_hl_hours, weight = norm_counts, fill = 'human_ptms_cyto_old_hl_no_pho'), alpha = 0.7, bw = 2) +
  geom_density(data = human_ptms_cyto_young_hl_no_pho, aes(x = mean_hl_hours, weight = norm_counts, fill = 'human_ptms_cyto_young_hl_no_pho'), alpha = 0.7, bw = 2) +
  geom_density(data = human_ptms_cyto_young_hl_pho, aes(x = mean_hl_hours, weight = norm_counts, fill = 'human_ptms_cyto_young_hl_pho'), alpha = 0.7, bw = 2) +
  labs(x = 'Mean half-lives (hours)', y = 'Density') +
  scale_x_continuous(limits = c(0, 100)) +
  theme_classic() +
  scale_fill_manual(values = c('human_ptms_cyto_old_hl_pho' = '#EA7317', 'human_ptms_cyto_old_hl_no_pho' = '#FFB703', 'human_ptms_cyto_young_hl_no_pho' = "#5DB7B1", 'human_ptms_cyto_young_hl_pho' = '#3DA5D9'), name = 'Legend') # Manually specify fill colors
Warning: Removed 51 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 505 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 398 rows containing non-finite outside the scale range
(`stat_density()`).
Warning: Removed 30 rows containing non-finite outside the scale range
(`stat_density()`).
Methylation
Acetylation
Oxidation
oxPTMs
Approach:
Get a list of PTMs that correlate with ageing such as oxPTMs, acetylation etc.
Test whether the abundance of these PTMs changes between the long-lived proteins and the normal proteins.
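The test in the second step could, for example, be a two-sample rank test on the normalised PTM counts. The sketch below uses a Wilcoxon rank-sum test on simulated placeholder values. The choice of test is an assumption, since the source does not name one, and both vectors are synthetic.

```r
set.seed(1)  # reproducible placeholder data

# Simulated normalised counts of one PTM type; not real measurements.
counts_long_lived <- rexp(30, rate = 1)
counts_other      <- rexp(200, rate = 1.5)

# Two-sided Wilcoxon rank-sum test: do the two groups differ in location?
res <- wilcox.test(counts_long_lived, counts_other)
res$p.value
```

A rank-based test is used in this sketch because PTM counts are typically skewed, with most proteins carrying few modifications. Repeating the test over several PTM types would additionally call for a multiple-testing correction such as `p.adjust()`.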
Bar chart
Is there less phosphorylation in young cells compared to old cells?
Acetylation
# Keep acetylation (Unimod ID 1) and label each dataset by compartment and age.
df_1 <- human_ptms_nuc_old_hl %>% mutate(age_group = 'nuc_old') %>% filter(unimod_id == 1)
df_2 <- human_ptms_nuc_young_hl %>% mutate(age_group = 'nuc_young') %>% filter(unimod_id == 1)
df_3 <- human_ptms_cyto_old_hl %>% mutate(age_group = 'cyto_old') %>% filter(unimod_id == 1)
df_4 <- human_ptms_cyto_young_hl %>% mutate(age_group = 'cyto_young') %>% filter(unimod_id == 1)
df <- rbind(df_1, df_2, df_3, df_4)
# Bin mean half-lives into 50-hour groups.
# Note: rows with a missing mean_hl_hours fall through to "250+".
df <- df %>%
  mutate(hl_group = case_when(
    mean_hl_hours <= 50 ~ "0-50",
    mean_hl_hours <= 100 ~ "50-100",
    mean_hl_hours <= 150 ~ "100-150",
    mean_hl_hours <= 200 ~ "150-200",
    mean_hl_hours <= 250 ~ "200-250",
    TRUE ~ "250+"
  ))
# Mean normalised count per half-life bin and age group, one row per group.
mean_hours_per_hl_group <- df %>%
  group_by(hl_group, age_group) %>%
  summarise(mean_ptms_group = mean(norm_counts), .groups = "drop")
mod_group_colours <- c('nuc_old' = '#EA7317', 'cyto_old' = '#2364AA', 'nuc_young' = '#F9AE8B', 'cyto_young' = '#219EBC')
# Plot bar chart
ggplot(mean_hours_per_hl_group, aes(x = hl_group, y = mean_ptms_group, fill = age_group)) +
  geom_bar(stat = "identity", position = 'dodge') +
  scale_x_discrete(limits = c("0-50", "50-100", "100-150", "150-200", "200-250", "250+")) +
  scale_fill_manual(values = mod_group_colours, name = 'Key') +
  labs(x = "Half-lives (hours)",
       y = "Mean sum of normalised PTM counts") +
  theme_classic()
Phosphorylation
# Keep phosphorylation (Unimod ID 21) and label each dataset by compartment and age.
df_1 <- human_ptms_nuc_old_hl %>% mutate(age_group = 'nuc_old') %>% filter(unimod_id == 21)
df_2 <- human_ptms_nuc_young_hl %>% mutate(age_group = 'nuc_young') %>% filter(unimod_id == 21)
df_3 <- human_ptms_cyto_old_hl %>% mutate(age_group = 'cyto_old') %>% filter(unimod_id == 21)
df_4 <- human_ptms_cyto_young_hl %>% mutate(age_group = 'cyto_young') %>% filter(unimod_id == 21)
df <- rbind(df_1, df_2, df_3, df_4)
# Bin mean half-lives into 50-hour groups.
# Note: rows with a missing mean_hl_hours fall through to "250+".
df <- df %>%
  mutate(hl_group = case_when(
    mean_hl_hours <= 50 ~ "0-50",
    mean_hl_hours <= 100 ~ "50-100",
    mean_hl_hours <= 150 ~ "100-150",
    mean_hl_hours <= 200 ~ "150-200",
    mean_hl_hours <= 250 ~ "200-250",
    TRUE ~ "250+"
  ))
# Mean normalised count per half-life bin and age group, one row per group.
mean_hours_per_hl_group <- df %>%
  group_by(hl_group, age_group) %>%
  summarise(mean_ptms_group = mean(norm_counts), .groups = "drop")
mod_group_colours <- c('nuc_old' = '#EA7317', 'cyto_old' = '#2364AA', 'nuc_young' = '#F9AE8B', 'cyto_young' = '#219EBC')
# Plot bar chart
ggplot(mean_hours_per_hl_group, aes(x = hl_group, y = mean_ptms_group, fill = age_group)) +
  geom_bar(stat = "identity", position = 'dodge') +
  scale_x_discrete(limits = c("0-50", "50-100", "100-150", "150-200", "200-250", "250+")) +
  scale_fill_manual(values = mod_group_colours, name = 'Key') +
  labs(x = "Half-lives (hours)",
       y = "Mean sum of normalised PTM counts") +
  theme_classic()
Look at oxPTMs
# Keep all oxidative PTMs (Unimod IDs listed in oxPTMs$ID) and label each dataset by compartment and age.
df_1 <- human_ptms_nuc_old_hl %>% mutate(age_group = 'nuc_old') %>% filter(unimod_id %in% oxPTMs$ID)
df_2 <- human_ptms_nuc_young_hl %>% mutate(age_group = 'nuc_young') %>% filter(unimod_id %in% oxPTMs$ID)
df_3 <- human_ptms_cyto_old_hl %>% mutate(age_group = 'cyto_old') %>% filter(unimod_id %in% oxPTMs$ID)
df_4 <- human_ptms_cyto_young_hl %>% mutate(age_group = 'cyto_young') %>% filter(unimod_id %in% oxPTMs$ID)
df <- rbind(df_1, df_2, df_3, df_4)
# Bin mean half-lives into 50-hour groups.
# Note: rows with a missing mean_hl_hours fall through to "250+".
df <- df %>%
  mutate(hl_group = case_when(
    mean_hl_hours <= 50 ~ "0-50",
    mean_hl_hours <= 100 ~ "50-100",
    mean_hl_hours <= 150 ~ "100-150",
    mean_hl_hours <= 200 ~ "150-200",
    mean_hl_hours <= 250 ~ "200-250",
    TRUE ~ "250+"
  ))
# Mean normalised count per half-life bin and age group, one row per group.
mean_hours_per_hl_group <- df %>%
  group_by(hl_group, age_group) %>%
  summarise(mean_ptms_group = mean(norm_counts), .groups = "drop")
mod_group_colours <- c('nuc_old' = '#EA7317', 'cyto_old' = '#2364AA', 'nuc_young' = '#F9AE8B', 'cyto_young' = '#219EBC')
# Plot bar chart
ggplot(mean_hours_per_hl_group, aes(x = hl_group, y = mean_ptms_group, fill = age_group)) +
  geom_bar(stat = "identity", position = 'dodge') +
  scale_x_discrete(limits = c("0-50", "50-100", "100-150", "150-200", "200-250", "250+")) +
  scale_fill_manual(values = mod_group_colours, name = 'Key') +
  labs(x = "Half-lives (hours)",
       y = "Mean sum of normalised PTM counts") +
  theme_classic()